A Log-logistic Model for IR

Authors

  • Stéphane Clinchant
  • Eric Gaussier
Abstract

We first present in this paper an analytical view of heuristic retrieval constraints which yields simple tests to determine whether a retrieval function satisfies the constraints or not. We then review empirical findings on word frequency distributions and the central role played by burstiness in this context. This leads us to propose a formal definition of burstiness which can be used to characterize probability distributions with respect to this phenomenon. We then introduce the family of information-based IR models which naturally captures heuristic retrieval constraints when the underlying probability distribution is bursty and propose a new IR model within this family, based on the log-logistic distribution. The experiments we conduct on several collections illustrate the good behavior of the log-logistic IR model: It significantly outperforms the Jelinek-Mercer and Dirichlet prior language models on most collections we have used, with both short and long queries and for both the MAP and the precision at 10 documents. It also compares favorably to BM25 and has similar performance to classical DFR models such as InL2 and PL2.
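The abstract does not spell out the scoring function or the burstiness test, so the following is only a minimal sketch under stated assumptions. It assumes a log-logistic survival function with its shape parameter fixed to 1, a collection parameter set to the fraction of documents containing the term, and a DFR-style length normalization with a free parameter c. It also assumes that "bursty" means the conditional probability P(X >= x + eps | X >= x) increases with x. All function and parameter names are hypothetical and chosen for illustration.

```python
import math


def log_logistic_survival(t, lam):
    """P(T >= t) for a log-logistic law with shape 1 (assumed form)."""
    return lam / (t + lam)


def conditional_survival(x, eps, lam):
    """P(X >= x + eps | X >= x); under the assumed burstiness definition
    this should grow with x for a bursty distribution."""
    return log_logistic_survival(x + eps, lam) / log_logistic_survival(x, lam)


def lgd_score(query_tf, doc_tf, doc_len, avg_doc_len, doc_freq, n_docs, c=1.0):
    """Information-based score: sum of -log P(T >= t) over matching query terms.

    query_tf, doc_tf: dicts mapping term -> raw frequency
    doc_freq: dict mapping term -> number of documents containing the term
    c: length-normalization parameter (assumed, to be tuned per collection)
    """
    score = 0.0
    for term, qtf in query_tf.items():
        tf = doc_tf.get(term, 0)
        if tf == 0 or term not in doc_freq:
            continue  # terms absent from the document contribute nothing
        # Length-normalized term frequency (assumed DFR-style normalization).
        t = tf * math.log(1.0 + c * avg_doc_len / doc_len)
        # Collection parameter: fraction of documents containing the term.
        lam = doc_freq[term] / n_docs
        score += qtf * -math.log(log_logistic_survival(t, lam))
    return score


if __name__ == "__main__":
    query = {"logistic": 1}
    doc = {"logistic": 2, "model": 1}
    print(lgd_score(query, doc, doc_len=100, avg_doc_len=100,
                    doc_freq={"logistic": 10}, n_docs=1000))
```

Under these assumptions the score grows with term frequency but with diminishing gains, rewards rare terms, and penalizes long documents, which is the kind of behavior the heuristic retrieval constraints ask for.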

Similar resources

Comparison of the Cox Model and Parametric Models in Estimating Survival of Prostate Cancer Patients Treated with Radiotherapy

Background and purpose: Prostate cancer is the second most common malignant cancer in men, and radiotherapy is one of the treatments for this disease. The aim of this study was to determine the effect of demographic, clinical and pathological factors on the survival rate of patients receiving radiotherapy and to compare different survival models in order to identify an efficient one. Materials and methods: In a his...

The Zografos–Balakrishnan-log-logistic Distribution

The Zografos–Balakrishnan-log-logistic (ZBLL) distribution is a new three-parameter distribution introduced by Ramos et al. [1], who presented some of its properties, such as its probability density function, cumulative distribution function, moment generating function, hazard (failure) rate function, quantiles and moments, Rényi and Shannon ...

Estimation of the Collection Parameter of Information Models for IR

In this paper we explore various methods to estimate the collection parameter of information-based models for ad hoc information retrieval. In previous studies, this parameter was set to the average number of documents in which the word under consideration appears. We introduce here a fully formalized estimation method for both the log-logistic and the smoothed power law models that leads to i...

Hyperbolic Cosine Log-Logistic Distribution and Estimation of Its Parameters by Using Maximum Likelihood Bayesian and Bootstrap Methods

In this paper, a new probability distribution based on the family of hyperbolic cosine distributions is proposed, and its various statistical and reliability characteristics are investigated. The new category of HCF distributions is obtained by combining a baseline F distribution with the hyperbolic cosine function. Based on the baseline log-logistic distribution, we introduce a new di...

Determining the Effective Factors on Gastric Cancer Using Frailty Model in South-East and North of Iran

Background and Purpose: Gastric cancer is the third leading cause of mortality in Iran after cardiovascular diseases and accidents. The aim of the present study was to assess survival and the factors affecting it in gastric cancer patients using Cox and parametric models along with frailty. Materials and Methods: In this study, the medical records of gastric cancer patients treat...

Publication date: 2011